Integrate Automated QDQ placement tool - Part 2 #702

willg-nv · 2025-12-17T06:29:51Z

What does this PR do?

Type of change: new feature

Overview: This PR integrates an automated Q/DQ placement tool into ModelOpt. This PR is part 2 of 4 of the changes.

Part 1: #701
Part 2: #702
Part 3: #703
Part 4: #704

This PR contains the following changes:

Implement RegionPattern to represent the topology structure of Regions. InsertionPoints are also defined on RegionPattern. Regions with the same pattern are optimized at the same time.
Implement the RegionSearch class to divide an ONNX graph into small regions.
The RegionSearch Python file also provides an entry point that prints the region structures.
Unit tests for the new classes.
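
As a rough illustration of what "regions with the same pattern are optimized at the same time" could look like, the sketch below groups regions by a structural signature built from their op types. The `Region` dataclass, `pattern_signature`, and `group_by_pattern` are hypothetical stand-ins for illustration only; they are not the `RegionPattern` API introduced by this PR.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical stand-in for a region: an id plus the op types of its nodes."""

    region_id: int
    op_types: list[str]
    children: list["Region"] = field(default_factory=list)


def pattern_signature(region: Region) -> str:
    """Build a signature that depends only on structure, not on ids or tensor names."""
    own = "->".join(region.op_types)
    if not region.children:
        return own
    nested = ",".join(pattern_signature(child) for child in region.children)
    return f"{own}[{nested}]"


def group_by_pattern(regions: list[Region]) -> dict[str, list[int]]:
    """Map each signature to the ids of all regions that share it."""
    groups: dict[str, list[int]] = defaultdict(list)
    for region in regions:
        groups[pattern_signature(region)].append(region.region_id)
    return dict(groups)


regions = [
    Region(1, ["Conv", "Relu"]),
    Region(2, ["Conv", "Relu"]),
    Region(3, ["MatMul", "Add"]),
]
groups = group_by_pattern(regions)
```

Under this assumption, regions 1 and 2 land in the same group, so a single Q/DQ insertion decision could be evaluated once and applied to both.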

Usage

python -m modelopt.onnx.quantization.autotune.region_search --model model.onnx --verbose

Example output:

    ├─ Region 212 (Level 0, Type: COMPOSITE)
    │  ├─ Direct nodes: 0
    │  ├─ Total nodes (recursive): 9
    │  ├─ Children: 1
    │  ├─ Inputs: 3 tensors
    │  │    - xxx
    │  │    - xxx
    │  │    - xxx
    │  └─ Outputs: 1 tensors
    │       - xxx
    │
    │  Child regions:
    │
      ├─ Region 209 (Level 2, Type: LEAF) 
      │  ├─ Direct nodes: 9
      │  ├─ Total nodes (recursive): 9
      │  ├─ Children: 0
      │  ├─ Inputs: 11 tensors
      │  │    - xxx

Testing

Implemented unit tests for the new classes. All unit tests pass locally.

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed.
Is this change backward compatible?: Yes
Did you write any new necessary tests?: Yes
Did you add or update any necessary documentation?: No, documentation changes will be in Part 4.
Did you update Changelog?: No. The changelog update will be included in Part 4.

Additional Information

copy-pr-bot · 2025-12-17T06:29:55Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

willg-nv · 2025-12-22T01:47:10Z

Hi @ajrasane, could you help review this PR? Thanks!

modelopt/onnx/quantization/autotune/common.py

modelopt/onnx/quantization/autotune/region_pattern.py

modelopt/onnx/quantization/autotune/__init__.py

modelopt/onnx/quantization/autotune/region_pattern.py

modelopt/onnx/quantization/autotune/region_search.py

tests/unit/onnx/quantization/autotune/test_pattern_cache.py

gcunhase · 2026-01-09T00:37:31Z

modelopt/onnx/quantization/qdq_utils.py

+    quantized_tensors = set()
+
+    for node in onnx_model.graph.node:
+        if node.op_type == "QuantizeLinear":


If --dq_only is enabled, there may be only a DQ node indicating that a tensor is quantized. Please verify that those cases are supported by this function.

See

Model-Optimizer/modelopt/onnx/quantization/__main__.py

Line 210 in 307fe71

"--dq_only",
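
One possible way to cover the DQ-only case is sketched below. It is not the function from this PR: `collect_quantized_tensors` is a hypothetical name, and the rule it applies (a Q node marks its first input; a DQ node whose input is not produced by a Q node marks its output) is an assumption about what "quantized tensor" should mean here. The nodes are duck-typed with `op_type`, `input`, and `output`, which are the same attribute names ONNX `NodeProto` objects expose.

```python
from types import SimpleNamespace


def collect_quantized_tensors(nodes) -> set[str]:
    """Collect tensors quantized through Q/DQ pairs or through DQ-only weights.

    Assumption: a DequantizeLinear node with no upstream QuantizeLinear node
    is the DQ-only case, and its output is treated as the quantized tensor.
    """
    q_outputs = {n.output[0] for n in nodes if n.op_type == "QuantizeLinear"}
    quantized: set[str] = set()
    for n in nodes:
        if n.op_type == "QuantizeLinear":
            quantized.add(n.input[0])
        elif n.op_type == "DequantizeLinear" and n.input[0] not in q_outputs:
            quantized.add(n.output[0])
    return quantized


def make_node(op_type, inputs, outputs):
    """Build a minimal node stand-in for the example."""
    return SimpleNamespace(op_type=op_type, input=inputs, output=outputs)


nodes = [
    make_node("QuantizeLinear", ["act", "s0", "z0"], ["act_q"]),
    make_node("DequantizeLinear", ["act_q", "s0", "z0"], ["act_dq"]),
    make_node("DequantizeLinear", ["w_int8", "s1", "z1"], ["w_dq"]),  # DQ-only weight
    make_node("Conv", ["act_dq", "w_dq"], ["y"]),
]
result = collect_quantized_tensors(nodes)
```

Here the activation is found through its Q node and the weight through its standalone DQ node.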

ajrasane · 2026-01-12T23:45:44Z

modelopt/onnx/quantization/autotune/region_search.py

+from modelopt.onnx.quantization.graph_utils import get_tensor_consumer_node_indices
+
+# Module logger
+logger = logging.getLogger(__name__)


Could you use the logger created here for all the logging?
https://github.com/NVIDIA/Model-Optimizer/blob/727da95a9188aaeef6872a61acae9f1ffae844f6/modelopt/onnx/logging_config.py

ajrasane · 2026-01-13T00:10:36Z

modelopt/onnx/quantization/autotune/region_search.py

+        divergent_outputs = [
+            out.name for out in node.outputs if self._is_tensor_divergent(out.name)
+        ]
+        is_divergent = len(divergent_outputs) > 0


This can be simplified to:

is_divergent = any(self._is_tensor_divergent(out.name) for out in node.outputs)

ajrasane · 2026-01-13T00:15:39Z

modelopt/onnx/quantization/autotune/region_search.py

+                for next_node_idx in self.tensor_users_map[output.name]:
+                    if next_node_idx not in reachable:
+                        reachable[next_node_idx] = distance + 1
+                        queue.append((next_node_idx, distance + 1))


nit: can we skip adding the nodes to the queue unless distance + 1 < max_steps?

I don't think this extra check is needed. They will be skipped at Line 285 when they are popped:

if distance >= max_steps: continue
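
For reference, a minimal sketch of the bounded breadth-first search under discussion, with the limit checked when a node is popped. It simplifies the code in the diff: instead of walking `tensor_users_map` per output tensor, it takes a plain node-to-successors dictionary, which is an assumption made to keep the example self-contained.

```python
from collections import deque


def forward_reachable(
    start: int, successors: dict[int, list[int]], max_steps: int
) -> dict[int, int]:
    """Return {node_index: distance} for nodes within max_steps of start."""
    reachable = {start: 0}
    queue = deque([(start, 0)])
    while queue:
        node_idx, distance = queue.popleft()
        if distance >= max_steps:
            continue  # checked on pop: nodes at the limit are recorded but not expanded
        for next_node_idx in successors.get(node_idx, []):
            if next_node_idx not in reachable:
                reachable[next_node_idx] = distance + 1
                queue.append((next_node_idx, distance + 1))
    return reachable


chain = {0: [1], 1: [2], 2: [3], 3: []}
result = forward_reachable(0, chain, max_steps=2)
```

Checking at push time instead would yield the same `reachable` map as long as the distance is still recorded; it would only avoid a few queue entries that are dropped immediately on pop.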

ajrasane · 2026-01-13T00:18:28Z

modelopt/onnx/quantization/autotune/region_search.py

+        2. All nodes between divergence and convergence
+
+        **Algorithm:**
+        1. Identify all branches from the divergent node


Is it a mandatory criterion that a region must start with a divergent node and end with a convergent node?

a region must start with a divergent node
Yes. When the linear probe reaches a divergent node, RegionSearch always tries to create a new region.

a region must end with a convergent node
If the convergent node is too far away (>= 10 steps), RegionSearch treats the current divergent node as an orphan and tries to probe from its output branches.
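
The behavior described in this reply can be sketched roughly as follows. This is not the RegionSearch implementation: `find_convergence` and the node-to-successors dictionary are hypothetical, and the step limit is applied to the distance from each branch start, which is an assumption about how the boundary is counted.

```python
from collections import deque


def _distances(start, successors, max_steps):
    """Breadth-first distances from start, not expanding past max_steps."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if dist[cur] >= max_steps:
            continue
        for nxt in successors.get(cur, []):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist


def find_convergence(divergent, successors, max_steps=10):
    """Return the nearest node reachable from every branch, or None."""
    branches = successors.get(divergent, [])
    if len(branches) < 2:
        return None  # not a divergent node
    per_branch = [_distances(b, successors, max_steps) for b in branches]
    common = set(per_branch[0]).intersection(*per_branch[1:])
    if not common:
        return None  # no convergence in range: the caller treats the node as an orphan
    # Pick the common node whose farthest branch distance is smallest.
    return min(common, key=lambda n: (max(d[n] for d in per_branch), n))


diamond = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
converged = find_convergence(0, diamond)

far_apart = {0: [1, 2], 1: [3], 3: [5], 2: [5], 5: []}
orphan = find_convergence(0, far_apart, max_steps=1)
```

In the first graph the two branches meet at node 3, so a region could span nodes 0 through 3. In the second, the branches do not meet within the limit, so the function returns `None` and the caller would fall back to probing each branch separately.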

ajrasane · 2026-01-13T00:33:14Z

modelopt/onnx/quantization/autotune/region_search.py

+
+            # Share the tensor users map from Phase 1 to avoid recomputation.
+            # This map is expensive to build and is shared across all refinements.
+            region_builder.tensor_users_map = region_partitioner.tensor_users_map


Can we also share the forward_reachable_nodes map from Phase 1 to avoid recomputation?

modelopt/onnx/quantization/autotune/region_search.py

Signed-off-by: Will Guo <[email protected]>

willg-nv requested a review from a team as a code owner December 17, 2025 06:29

willg-nv requested a review from ajrasane December 17, 2025 06:29

willg-nv changed the title ~~Dev willg integrate auto qdq placement part2~~ Integrate Automated QDQ placement tool - Part 2 Dec 17, 2025

This was referenced Dec 17, 2025

Integrate Automated QDQ placement tool - Part 3 #703

Open

Integrate Automated QDQ placement tool - Part 4 #704

Open

Integrate Automated QDQ placement tool - Part 1 #701

Open

willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part2 branch from 3f7ff31 to d3a6765 Compare December 31, 2025 02:16

ajrasane reviewed Jan 8, 2026

willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part2 branch 2 times, most recently from 616285d to c95939a Compare January 8, 2026 08:35

gcunhase reviewed Jan 8, 2026

modelopt/onnx/quantization/autotune/__init__.py (outdated)

gcunhase reviewed Jan 9, 2026

modelopt/onnx/quantization/autotune/region_pattern.py (outdated)

gcunhase reviewed Jan 9, 2026

modelopt/onnx/quantization/autotune/region_pattern.py (outdated)

gcunhase reviewed Jan 9, 2026

modelopt/onnx/quantization/autotune/region_search.py (outdated)

gcunhase reviewed Jan 9, 2026

modelopt/onnx/quantization/autotune/region_search.py

gcunhase reviewed Jan 9, 2026

tests/unit/onnx/quantization/autotune/test_pattern_cache.py (outdated)

gcunhase reviewed Jan 9, 2026

willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part2 branch 3 times, most recently from 4468ca2 to bc87ca7 Compare January 9, 2026 05:02

ajrasane reviewed Jan 13, 2026

Integrate Automated QDQ placement tool - part 2

6f809d7

Signed-off-by: Will Guo <[email protected]>

willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part2 branch from bc87ca7 to 6f809d7 Compare January 15, 2026 08:13

3 participants